CloudBank Resources Raise Data Science Education Programs to New Heights

UC Berkeley and UC San Diego provide computational resources to six California community colleges

Published June 21, 2023

UC Berkeley’s Vice Chancellor for Research Kathy Yelick is co-principal investigator for the SDSC-led CloudBank. Credit: UC Berkeley

By Kimberly Mann Bruch

Because computational needs often fluctuate over the course of a university-level class, one of CloudBank’s goals has been to facilitate the use of public cloud resources in the classroom. In parallel, UC Berkeley’s College of Computing, Data Science and Society developed the Berkeley Data Stack, an auto-scaling, JupyterHub-based learning platform that scales up when assignments are due and scales back down between assignments.

As a CloudBank partner, UC Berkeley wanted to share this platform with instructors at community colleges that may not have the local resources to set up and support a similar system. The goal was to reach approximately 500 students across six community colleges. Currently, the Berkeley Data Stack reaches 300 to 400 students through the partnership with the San Diego Supercomputer Center (SDSC) at UC San Diego.

Led by Co-Principal Investigator Shava Smallen, the CloudBank team at SDSC provided administrative support for the project. Smallen said she worked with Berkeley’s Kathy Yelick, Eric Van Dusen and Sean Morris as they deployed their teaching stack, which includes a Jupyter hub, a set of labs and notebooks, interactive links for accessing the content and an autograding solution.

“CloudBank makes it very easy for new learners to start computing on their own as all of the course materials are accessed through a browser window with no installation,” said Van Dusen, outreach lead for Berkeley’s College of Computing, Data Science and Society. “Having a CloudBank Jupyter hub is helping to put community college education on the same foundation as UC Berkeley with the same infrastructure for interactive computing.”

Van Dusen said that the team worked with 2i2c, a nonprofit infrastructure organization, to implement the pilot at the following community colleges:

  • El Camino Community College, Los Angeles
  • Santa Barbara Community College, Santa Barbara
  • City College of San Francisco, San Francisco
  • Palomar Community College, San Diego
  • Skyline Community College, San Mateo
  • San Jose Community College, San Jose

“The use of CloudBank allows us to use larger datasets than we would be able to use with our local machines,” said Peter Chen, Palomar College instructor. “We can experiment with various data tools that aren’t realistic without the use of cloud computing. We are really happy with this partnership and thank the team at Berkeley and SDSC for the opportunity.”

The UC Berkeley platform was initially developed for the popular Data 8 course on campus and quickly expanded to many other courses at UC Berkeley, using local resources to support 10,000 students per month.

CloudBank is an integrated service provider that works to broaden the access to and impact of cloud computing across the many fields of computer science research and education. It is supported by a National Science Foundation award to UC San Diego, UC Berkeley and the University of Washington (grant no. 1925001).